Open
avvertix
requested changes
Apr 10, 2026
- [PyMuPDF4LLM](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/) - Aimed to make it easier to extract PDF content in the format you need for LLM & RAG environments. It supports Markdown extraction as well as LlamaIndex document output.
- [CatchTheTornado/pdf-extract-api](https://github.com/CatchTheTornado/pdf-extract-api) - Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
- [climatepolicyradar/navigator-document-parser](https://github.com/climatepolicyradar/navigator-document-parser) - Parsing PDFs and websites containing laws and policies.
- [Iteration Layer](https://iterationlayer.com) - Document extraction API that extracts structured data from PDFs, images, DOCX, and text files using AI.
Contributor
I'd like to propose a change to the description to differentiate it from the other entries.
Suggested change
- [Iteration Layer](https://iterationlayer.com) - Document extraction API that extracts structured data from PDFs, images, DOCX, and text files using AI.
- [Iteration Layer](https://iterationlayer.com) - An AI-powered API that extracts structured data from PDFs, images, DOCX, and text files.
Contributor
Thanks @fschucht for bringing your service to our attention. I just proposed a small change to the description so that it can be included in the list.
Adds Iteration Layer to the Parsers, OCR and extraction section.
Closes #12
What it does
Iteration Layer is a document extraction API that uses AI to extract structured data from PDFs, images (PNG, JPG, WebP), DOCX, and text files. You define a schema with fields and receive structured JSON output.
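
To illustrate the schema-in, JSON-out idea described above, here is a minimal local sketch. It does not call Iteration Layer and does not reflect its actual API or schema format: the schema layout, the `conforms` helper, and the sample response are all hypothetical, chosen only to show what checking structured output against a set of declared fields could look like.

```python
import json

# Hypothetical schema: field name -> expected Python type.
# This is an illustrative assumption, not Iteration Layer's schema format.
INVOICE_SCHEMA = {"invoice_number": str, "total": float, "currency": str}


def conforms(payload: str, schema: dict) -> bool:
    """Return True if the JSON object has exactly the schema's fields with matching types."""
    data = json.loads(payload)
    if not isinstance(data, dict) or set(data) != set(schema):
        return False
    return all(isinstance(data[name], expected) for name, expected in schema.items())


# Made-up response standing in for what an extraction service might return.
sample = '{"invoice_number": "INV-001", "total": 129.5, "currency": "EUR"}'
print(conforms(sample, INVOICE_SCHEMA))
```

A check like this is deliberately strict (exact field set, exact types); a real integration would follow whatever schema and response format the service documents.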
Why it fits
The Parsers, OCR and extraction section lists tools like Reducto (Document Ingestion API) and pdf-extract-api. Iteration Layer serves the same purpose: it is an API for AI-powered structured extraction from PDFs and other documents.